Abstract

The F.A.S.T. model for microscopic simulation of pedestrians was
formulated with parallelizability and short computation times in mind,
but it had so far never been demonstrated whether it can in fact be
implemented efficiently for execution on a multi-core or multi-CPU
system. This contribution reports computation times for the F.A.S.T.
model on an eight-core PC.

1 Introduction 

Pedestrians and vehicles alike are extended objects in space. For simulation
models [1-6] this means that one has to guarantee a mutual exclusion volume,
i.e. that agents do not overlap. This exclusion volume includes not only the
mere body or vehicle but also a headway whose size increases monotonically
with speed [7, 8]. Mutual exclusion can be achieved in at least two ways:
either all agents compute their next movement step in parallel and any
conflicts about exclusion volumes that emerge are resolved afterwards [9],
or the movement is done sequentially, a strategy that makes it easy to
prevent conflicts altogether. However, the kind of update procedure has a
strong influence on the dynamics of the system [9-11]. Parallel update is
obviously the better fit for parallel computing, and as a lucky coincidence
parallel update has proven to usually yield more realistic results than
sequential update when physical systems are simulated [12, 13].

The F.A.S.T. model [14-19] tries to make use of both strategies: the
planning process for the next movement step is done in parallel, and the
actual movement is done sequentially to avoid a computationally costly
conflict resolution later on. Additionally, computationally costly
calculations (like exponential or trigonometric functions) are used only in
the planning process, while the actual movement consists only of very simple
commands and calculations.

Each time step thus consists of a computationally rather expensive planning
phase, in which data common to all agents is only read (easy to
parallelize), and a computationally cheaper motion phase, which is
difficult to parallelize because it writes to data structures common to all
agents. This contribution reports measurements of the computation time of a
first parallelized version of the algorithm.
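
The two-phase time step described above can be illustrated with a minimal
sketch. It is not the F.A.S.T. implementation: the one-dimensional grid, the
Agent struct, the toy planning rule and the target parameter are all
assumptions made for illustration. Only the structure (read-only planning in
parallel, then sequential movement that writes to the shared grid) follows
the text.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal agent on a 1D row of cells (not the F.A.S.T. data layout).
struct Agent {
    int cell;     // current cell index
    int desired;  // cell chosen in the planning phase
};

// Toy planning rule (assumption): step one cell toward `target` if that cell is free.
// Reads the shared grid, writes only to the agent itself.
void choose_desired_cell(Agent& a, const std::vector<char>& occupied, int target) {
    int step = (target > a.cell) - (target < a.cell);  // -1, 0 or +1
    int next = a.cell + step;
    a.desired = occupied[next] ? a.cell : next;
}

// One time step: parallel planning, then sequential movement.
void time_step(std::vector<Agent>& agents, std::vector<char>& occupied, int target) {
    assert(target >= 0 && target < static_cast<int>(occupied.size()));
    const int n = static_cast<int>(agents.size());

    // Planning phase: `occupied` is only read and each agent writes only its
    // own `desired`, so the loop iterations are independent of each other.
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        choose_desired_cell(agents[i], occupied, target);

    // Movement phase: sequential writes to the shared grid. If two agents
    // planned the same cell, the one processed first gets it.
    for (int i = 0; i < n; i++) {
        Agent& a = agents[i];
        if (a.desired != a.cell && !occupied[a.desired]) {
            occupied[a.cell] = 0;
            occupied[a.desired] = 1;
            a.cell = a.desired;
        }
    }
}
```

With two agents in cells 1 and 3 and target cell 2, both plan to enter cell
2. In the sequential movement phase the first agent takes the cell and the
second stays where it is, so no separate conflict resolution step is needed.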

2 Technical Details 

The parameters of the F.A.S.T. model were chosen to be k_S = 1.2 and
k_other = 0. Within a simulation all agents had the same maximum speed. All
calculations were done for all maximum speeds from v_max = 1 to v_max = 5,
but as the fundamental diagram of Weidmann [20, 21] is reproduced quite
well with v_max = 4 [17], the results section below focuses on v_max = 4.

The computation times given in the next section refer to the simulation
of 396 simulation time steps, equivalent to 396 simulated seconds.

The simulations were carried out on a PC with two Xeon E5320 quad-core
processors and 20 GB RAM. The parallelization was done using OpenMP
[22], and the source code was compiled using the Visual C++ 8 (Visual
Studio 2005) compiler.

Parallel computing was used only in the process of choosing a desired
cell for each agent:

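    // Chunk size: about one chunk per core, clamped to [1, 32767].
    blocksize = max(min(number_of_agents / cores, 32767), 1);
    #pragma omp parallel num_threads(cores)
    {
        // Chunks of blocksize agents are handed to threads as they become free.
        #pragma omp for schedule(dynamic, blocksize)
        for (int i = 0; i < number_of_agents; i++)
            choose_desired_cell(i);
    }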
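
To make the chunk size in the loop above concrete, the following sketch
wraps the same expression in a function. The function name block_size is
hypothetical. The source does not state why the upper limit of 32767 was
chosen.

```cpp
#include <algorithm>
#include <cassert>

// Same expression as in the loop above: roughly one chunk per core,
// clamped to the range [1, 32767]. Integer division rounds down.
int block_size(int number_of_agents, int cores) {
    assert(cores > 0);
    return std::max(std::min(number_of_agents / cores, 32767), 1);
}
```

For 40,000 agents on eight cores this gives chunks of 5,000 agents, so each
thread receives about one chunk. For one million agents the quotient of
125,000 is capped at 32,767, so more chunks than cores are handed out
dynamically. With fewer agents than cores the lower limit of 1 applies.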

3 Results and Conclusions 

One of the main results from figures 1 to 7 is that with a parameter
configuration that reproduces Weidmann's fundamental diagram fairly well,
and using all eight cores, real-time computation speed could be achieved
with about 182,000 agents in the simulation simultaneously. However,
simulating more complex situations like counterflow or non-trivial route
choice would require additional calculations and reduce simulation speed.

Figure 1: Number of agents that can be simulated in real time as a function
of the number of cores.

Figure 2: Computation time as a function of the number of agents at fixed
maximum speed v_max = 4.

Figure 3: Computation speed factor as a function of the number of cores at
fixed maximum speed v_max = 4.

Figure 4: Computation speed factor as a function of the number of agents
when eight cores are used (interpolation with splines).

Figure 5: Computation time as a function of maximum speed for 40,000
agents.

Figure 6: Computation time as a function of the number of agents for 8 cores.

Figure 7: Computation speed factor as a function of the number of cores for
40,000 agents.

The computation speed factors with eight cores compared to using only
one core of the same PC were found to be in the range 4.3 to 5.0. This
confirms that the initial intention of having a model suited to parallel
computation is met.
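
As a rough reading of these speed factors, the sketch below computes the
parallel efficiency and the parallel fraction implied by Amdahl's law.
Applying Amdahl's law here is an assumption, not part of the measurements:
it ignores threading overhead and memory bandwidth limits. The function
names are hypothetical.

```cpp
#include <cassert>

// Parallel efficiency: measured speed factor divided by the number of cores.
double efficiency(double speed_factor, int cores) {
    assert(cores > 0);
    return speed_factor / cores;
}

// Amdahl's law, S = 1 / ((1 - p) + p / N), solved for the parallel
// fraction p of the single-core run time:
//   p = (1 - 1/S) / (1 - 1/N)
double parallel_fraction(double speed_factor, int cores) {
    assert(cores > 1 && speed_factor > 0.0);
    return (1.0 - 1.0 / speed_factor) / (1.0 - 1.0 / cores);
}
```

For speed factors of 4.3 and 5.0 on eight cores the efficiency is about 54%
and 63%, and under the Amdahl assumption about 88% to 91% of the single-core
run time would lie in the parallelized planning phase.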

The wide range of factors in figure 4 may hint that there are more
efficient ways to partition the parallel parts of the calculation.

Apart from the model making efficient use of a high number of cores,
faster hardware has become available: Intel has released a CPU (Xeon 5482)
with 20% to 50% higher performance, depending on the kind of computation.
If it is possible to make use of this performance increase, a real-time
simulation could be achieved with well beyond 200,000 agents, and a
stadium-size evacuation simulation (40,000 agents) could be run in 5% to
15% of real time, as evacuation always implies that the average number of
active agents during the course of the simulation is roughly half the
initial number.
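
The arithmetic behind this projection can be sketched as follows. Both
functions assume linear scaling, which is not stated in the source: the
number of agents that can be simulated in real time is taken to grow in
proportion to CPU performance, and computation time is taken to be
proportional to the average number of active agents. The function names are
hypothetical.

```cpp
#include <cassert>

// Agents that could be simulated in real time after a hardware speed-up,
// assuming capacity scales linearly with performance (assumption).
double projected_real_time_agents(double agents_now, double performance_gain) {
    assert(agents_now > 0.0 && performance_gain >= 0.0);
    return agents_now * (1.0 + performance_gain);
}

// Fraction of real time needed for an evacuation, assuming computation time
// is proportional to the average number of active agents, taken as half the
// initial number (as stated in the text).
double evacuation_time_fraction(double initial_agents, double real_time_agents) {
    assert(real_time_agents > 0.0);
    return 0.5 * initial_agents / real_time_agents;
}
```

Starting from 182,000 agents, a gain of 20% to 50% gives about 218,400 to
273,000 agents in real time. Under the same assumptions an evacuation of
40,000 agents would take roughly 7% to 9% of real time, which lies within
the range of 5% to 15% given above.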